158 research outputs found

    Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial

    Recent results of Ye and of Hansen, Miltersen and Zwick show that policy iteration for one or two player (perfect information) zero-sum stochastic games, restricted to instances with a fixed discount rate, is strongly polynomial. We show that policy iteration for mean-payoff zero-sum stochastic games is also strongly polynomial when restricted to instances with bounded first mean return time to a given state. The proof is based on methods of nonlinear Perron-Frobenius theory, allowing us to reduce the mean-payoff problem to a discounted problem with state dependent discount rate. Our analysis also shows that policy iteration remains strongly polynomial for discounted problems in which the discount rate can be state dependent (and even negative) at certain states, provided that the spectral radii of the nonnegative matrices associated with all strategies are bounded from above by a fixed constant strictly less than 1. Comment: 17 pages

    Spectral Theorem for Convex Monotone Homogeneous Maps, and Ergodic Control

    We consider convex maps f:R^n -> R^n that are monotone (i.e., that preserve the product ordering of R^n), and nonexpansive for the sup-norm. This includes convex monotone maps that are additively homogeneous (i.e., that commute with the addition of constants). We show that the fixed point set of f, when it is non-empty, is isomorphic to a convex inf-subsemilattice of R^n, whose dimension is at most equal to the number of strongly connected components of a critical graph defined from the tangent affine maps of f. In particular, this yields a uniqueness result for the bias vector of ergodic control problems. This generalizes results obtained previously by Lanery, Romanovsky, and Schweitzer and Federgruen, for ergodic control problems with finite state and action spaces, which correspond to the special case of piecewise affine maps f. We also show that the length of periodic orbits of f is bounded by the cyclicity of its critical graph, which implies that the possible orbit lengths of f are exactly the orders of elements of the symmetric group on n letters. Comment: 38 pages, 13 Postscript figures

    The Operator Approach to Entropy Games

    Entropy games and matrix multiplication games were recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.

    Tropical Cramer Determinants Revisited

    We prove general Cramer-type theorems for linear systems over various extensions of the tropical semiring, in which tropical numbers are enriched with information about multiplicity, sign, or argument. We obtain existence or uniqueness results, which extend or refine earlier results of Gondran and Minoux (1978), Plus (1990), Gaubert (1992), Richter-Gebert, Sturmfels and Theobald (2005) and Izhakian and Rowen (2009). Computational issues are also discussed; in particular, some of our proofs lead to Jacobi and Gauss-Seidel type algorithms to solve linear systems in suitably extended tropical semirings. Comment: 41 pages, 5 figures

    The max-plus Martin boundary

    We develop an idempotent version of probabilistic potential theory. The goal is to describe the set of max-plus harmonic functions, which give the stationary solutions of deterministic optimal control problems with additive reward. The analogue of the Martin compactification is seen to be a generalisation of the compactification of metric spaces using (generalised) Busemann functions. We define an analogue of the minimal Martin boundary and show that it can be identified with the set of limits of "almost-geodesics", and also the set of (normalised) harmonic functions that are extremal in the max-plus sense. Our main result is a max-plus analogue of the Martin representation theorem, which represents harmonic functions by measures supported on the minimal Martin boundary. We illustrate it by computing the eigenvectors of a class of translation invariant Lax-Oleinik semigroups. In this case, we relate the extremal eigenvectors to the Busemann points of a normed space. Comment: 37 pages; 8 figures. v1: December 20, 2004. v2: June 7, 2005. Section 12 added

    Hypergraph conditions for the solvability of the ergodic equation for zero-sum games

    The ergodic equation is a basic tool in the study of mean-payoff stochastic games. Its solvability entails that the mean payoff is independent of the initial state. Moreover, optimal stationary strategies are readily obtained from its solution. In this paper, we give a general sufficient condition for the solvability of the ergodic equation, for a game with finite state space but arbitrary action spaces. This condition involves a pair of directed hypergraphs depending only on the "growth at infinity" of the Shapley operator of the game. This refines a recent result of the authors which only applied to games with bounded payments, as well as earlier nonlinear fixed point results for order preserving maps, involving graph conditions. Comment: 6 pages, 1 figure, to appear in Proc. 54th IEEE Conference on Decision and Control (CDC 2015)

    How to find horizon-independent optimal strategies leading off to infinity: a max-plus approach

    A general problem in optimal control consists of finding a terminal reward that makes the value function independent of the horizon. Such a terminal reward can be interpreted as a max-plus eigenvector of the associated Lax-Oleinik semigroup. We give a representation formula for all these eigenvectors, which applies to optimal control problems in which the state space is noncompact. This representation involves an abstract boundary of the state space, which extends the boundary of metric spaces defined in terms of Busemann functions (the horoboundary). Extremal generators of the eigenspace correspond to certain boundary points, which are the limits of almost-geodesics. We illustrate our results in the case of a linear quadratic problem. Comment: 13 pages, 5 figures, to appear in Proc. 45th IEEE Conference on Decision and Control

    Log-majorization of the moduli of the eigenvalues of a matrix polynomial by tropical roots

    We show that the sequence of moduli of the eigenvalues of a matrix polynomial is log-majorized, up to universal constants, by a sequence of "tropical roots" depending only on the norms of the matrix coefficients. These tropical roots are the non-differentiability points of an auxiliary tropical polynomial, or equivalently, the opposites of the slopes of its Newton polygon. This extends to the case of matrix polynomials some bounds obtained by Hadamard, Ostrowski and Pólya for the roots of scalar polynomials. We also obtain new bounds in the scalar case, which are accurate for "fewnomials" or when the tropical roots are well separated. Comment: 36 pages, 19 figures